    High-level and Low-level Feature Set for Image Caption Generation with Optimized Convolutional Neural Network, Journal of Telecommunications and Information Technology, 2022, nr 4

    Automatic creation of image descriptions, i.e. captioning of images, is an important topic in artificial intelligence (AI) that bridges the gap between computer vision (CV) and natural language processing (NLP). Neural networks are becoming increasingly popular for image captioning, and researchers are looking for more efficient models for CV and sequence-to-sequence systems. This study focuses on a new image caption generation model that is divided into two stages. First, low-level features, such as contrast, sharpness, and color, and their high-level counterparts, such as motion and facial impact score, are extracted. Then, an optimized convolutional neural network (CNN) is used to generate the captions from images. To enhance the accuracy of the process, the weights of the CNN are tuned via spider monkey optimization with sine chaotic map evaluation (SMO-SCME). The proposed method is evaluated with a variety of metrics.

    A Comprehensive Review and Open Challenges on Visual Question Answering Models

    Thanks to recent developments in artificial intelligence, users can now actively interact with images, pose questions about them, and expect an answer in natural language. The study discusses a variety of datasets that can be used to examine applications of visual question answering (VQA), as well as their advantages and disadvantages. Four forms of VQA models (simple joint embedding-based models, attention-based models, knowledge-incorporated models, and domain-specific VQA models) are examined in depth in this article. We also critically assess the drawbacks and future possibilities of all current state-of-the-art (SOTA), end-to-end VQA models. Finally, we present directions and guidelines for further development of VQA models.

    The role of centralized reading of endoscopy in a randomized controlled trial of mesalamine for ulcerative colitis

    Background & Aims: Interobserver differences in endoscopic assessments contribute to variations in rates of response to placebo in ulcerative colitis (UC) trials. We investigated whether centralized review of images could reduce these variations. Methods: We performed a 10-week, randomized, double-blind, placebo-controlled study of 281 patients with mildly to moderately active UC, defined by an Ulcerative Colitis Disease Activity Index (UCDAI) sigmoidoscopy score ≥2, that evaluated the efficacy of delayed-release mesalamine (Asacol 800-mg tablet) 4.8 g/day. Endoscopic images were reviewed by a single expert central reader. The primary outcome was clinical remission (UCDAI, stool frequency and bleeding scores of 0, and no fecal urgency) at week 6. Results: The primary outcome was achieved by 30.0% of patients treated with mesalamine and 20.6% of those given placebo, a difference of 9.4% (95% confidence interval [CI], -0.7% to 19.4%; P =.069). Significant differences in results from secondary analyses indicated the efficacy of mesalamine. Thirty-one percent of participants, all of whom had a UCDAI sigmoidoscopy score ≥2 as read by the site investigator, were considered ineligible by the central reader. After exclusion of these patients, the remission rates were 29.0% and 13.8% in the mesalamine and placebo groups, respectively (difference of 15%; 95% CI, 3.5%-26.0%; P =.011). Conclusions: Although mesalamine 4.8 g/day was not statistically different from placebo for induction of remission in patients with mildly to moderately active UC, based on an intent-to-treat analysis, the totality of the data supports a benefit of treatment. Central review of endoscopic images is critical to the conduct of induction studies in UC; ClinicalTrials.gov Number, NCT01059344. © 2013 by the AGA Institute